Best k-Layer Neural Network Approximations
Authors
Abstract
We show that the empirical risk minimization (ERM) problem for neural networks has no solution in general. Given a training set $$s_1, \ldots , s_n \in \mathbb{R}^p$$ with corresponding responses $$t_1, \ldots , t_n \in \mathbb{R}^q$$, fitting a k-layer neural network $$\nu _\theta : \mathbb{R}^p \rightarrow \mathbb{R}^q$$ involves estimation of the weights $$\theta \in \mathbb{R}^m$$ via an ERM: $$\begin{aligned} \inf _{\theta \in \mathbb{R}^m} \ \sum _{i=1}^n \Vert t_i - \nu _\theta (s_i) \Vert _2^2. \end{aligned}$$ We show that even for $$k = 2$$, this infimum is not attainable in general for common activations like ReLU, hyperbolic tangent, and sigmoid functions. In addition, we deduce that if one attempts to minimize such a loss function in the event that its infimum is not attainable, it necessarily results in values of $$\theta$$ diverging to $$\pm \infty$$. We will show that for the smooth activations $$\sigma (x) = 1/\bigl (1 + \exp (-x)\bigr )$$ and $$\sigma (x) = \tanh (x)$$, such failure to attain an infimum can happen on a positive-measured subset of responses. For the ReLU activation $$\sigma (x) = \max (0,x)$$, we completely classify the cases where the ERM for a best two-layer neural network approximation attains its infimum. In recent applications of neural networks, where overfitting is commonplace, the failure to attain an infimum is avoided by ensuring that the system of equations $$t_i = \nu _\theta (s_i)$$, $$i = 1, \ldots , n$$, has a solution. For a two-layer ReLU-activated network, we will show when such a system of equations has a solution generically, i.e., when such a neural network can be fitted perfectly with probability one.
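As a rough illustration of the ERM objective in the abstract, the following NumPy sketch evaluates the sum of squared residuals between the responses t_i and the network outputs at the inputs s_i for a two-layer ReLU network. The parameterization nu_theta(s) = W2 max(0, W1 s + b1) + b2, the function names, and the toy data are assumptions made for this example; they are not taken from the paper. The sketch only evaluates the objective and does not demonstrate the non-attainment result.

```python
import numpy as np


def two_layer_relu(theta, S):
    """Evaluate an assumed two-layer ReLU network on each row of S.

    theta = (W1, b1, W2, b2) with W1: (h, p), b1: (h,), W2: (q, h), b2: (q,).
    S has shape (n, p); the result has shape (n, q).
    """
    W1, b1, W2, b2 = theta
    hidden = np.maximum(0.0, S @ W1.T + b1)  # ReLU applied entrywise
    return hidden @ W2.T + b2


def empirical_risk(theta, S, T):
    """Sum over i of the squared Euclidean norm of t_i - nu_theta(s_i)."""
    residual = T - two_layer_relu(theta, S)
    return float(np.sum(residual ** 2))


# Hypothetical toy data with p = q = 1 and hidden width h = 2.
# These weights realize nu_theta(s) = max(0, s) + max(0, -s) = |s|.
theta_abs = (
    np.array([[1.0], [-1.0]]),  # W1
    np.zeros(2),                # b1
    np.array([[1.0, 1.0]]),     # W2
    np.zeros(1),                # b2
)
S = np.array([[-1.0], [0.0], [2.0]])
T_fit = np.array([[1.0], [0.0], [2.0]])    # equals |s_i|, so the fit is exact
T_other = np.array([[1.0], [1.0], [2.0]])  # differs from |s_i| at s = 0

risk_fit = empirical_risk(theta_abs, S, T_fit)
risk_other = empirical_risk(theta_abs, S, T_other)
```

Here `risk_fit` corresponds to the overfitting regime mentioned in the abstract, where the equations t_i = nu_theta(s_i) hold for every i and the infimum is trivially attained at zero.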
Similar resources
Gradient-enhanced Neural Network Response Surface Approximations
An approach to develop response surface approximations based upon artificial neural networks trained using both state and sensitivity information is described in this paper. Compared to previous approaches, this approach does not require weighting the residuals of the targets and gradients and is able to approximate gradient-consistent response surfaces with a relatively compact network archite...
Adaptive friction compensation using neural network approximations
We present a new compensation technique for a friction model, which captures problematic friction effects such as Stribeck effects, hysteresis, stick-slip limit cycling, pre-sliding displacement and rising static friction. The proposed control utilizes a PD control structure and an adaptive estimate of the friction force. Specifically, a radial basis function (RBF) is used to compensate the eff...
Best subspace tensor approximations
In many applications such as data compression, imaging or genomic data analysis, it is important to approximate a given tensor by a tensor that is sparsely representable. For matrices, i.e. 2-tensors, such a representation can be obtained via the singular value decomposition, which allows one to compute the best rank-k approximations. For t-tensors with t > 2 many generalizations of the singular val...
Testing for structural breaks in nonlinear dynamic models using artificial neural network approximations
In this paper we suggest a number of statistical tests based on neural network models that are designed to be powerful against structural breaks in otherwise stationary time series processes while allowing for a variety of nonlinear specifications for the dynamic model underlying them. It is clear that in the presence of nonlinearity standard tests of structural breaks for linear models may no...
Journal
Journal title: Constructive Approximation
Year: 2021
ISSN: 0176-4276, 1432-0940
DOI: https://doi.org/10.1007/s00365-021-09545-2